Automated PDF highlighting to support faster curation of literature for Parkinson’s and Alzheimer’s disease

Authors

  • Honghan Wu
  • Anika Oellrich
  • Christine Girges
  • Bernard de Bono
  • Tim J. P. Hubbard
  • Richard J. B. Dobson
Abstract

Neurodegenerative disorders such as Parkinson's and Alzheimer's disease are devastating and costly illnesses and a major source of global burden. To provide successful interventions for patients and reduce costs, both causes and pathological processes need to be understood. The ApiNATOMY project aims to contribute to our understanding of neurodegenerative disorders by manually curating and abstracting data from the vast body of literature amassed on these illnesses. As curation is labour-intensive, we aimed to speed up the process by automatically highlighting those parts of the PDF document of primary importance to the curator. Using techniques similar to those of summarisation, we developed an algorithm that relies on linguistic, semantic and spatial features. Employing this algorithm on a test set manually corrected for tool imprecision, we achieved a macro F1-measure of 0.51, an increase of 132% over the best bag-of-words baseline model. A user-based evaluation was also conducted to assess the usefulness of the methodology on 40 unseen publications; it revealed that in 85% of cases all highlighted sentences were relevant to the curation task, and in about 65% of cases the highlights were sufficient to support the knowledge curation task without the need to consult the full text. In conclusion, we believe that these are promising results for a step towards automating the recognition of curation-relevant sentences. Refining our approach to pre-digest papers will lead to faster processing and cost reduction in the curation process.

Database URL: https://github.com/KHP-Informatics/NapEasy

Similar resources

Study of the foundation, models and issues of research data curation and management in scientific and academic environments

Background and Aim: The purpose of this paper is to study, identify and discuss the foundations and concepts, models and frameworks, dimensions and challenges of research data curation and management in scientific and academic environments. Method: This is a review article; the library method was used to collect scientific and research texts in this field. In this research, external an...

A Curation Pipeline and Web-Services for PDF Documents

The continuous growth of the biomedical literature and the need to efficiently find and extract information from its content have led to the development of various text mining tools. More recently, these tools have started to be integrated into user-friendly applications that facilitate their use by expert database curators. However, these tools were mainly designed to extract information from text-based docu...

Theoretical study of structure spectral properties of Tacrine as Alzheimer drug

Tacrine (9-amino-1,2,3,4-tetrahydroacridine), a reversible inhibitor of acetylcholinesterase (AChE), was the first drug for the symptomatic treatment of Alzheimer’s disease (AD). NMR structure determination still presents some considerable challenges: the method is limited to systems of relatively small molecular mass, data collection times are long, data analysis remains a lengthy procedure, and...

Uptake index of 123I-metaiodobenzylguanidine myocardial scintigraphy for diagnosing Lewy body disease

Objective(s): Iodine-123 metaiodobenzylguanidine (123I-MIBG) myocardial scintigraphy has been used to evaluate cardiac sympathetic denervation in Lewy body disease (LBD), including Parkinson’s disease (PD) and dementia with Lewy bodies (DLB). The heart-to-mediastinum ratio (H/M) in PD and DLB is significantly lower than that in Parkinson’s plus syndromes and Alzheimer’s disease. Although this ra...

Recommendations to standardize Pre-analytical confounding factors in Alzheimer’s and Parkinson’s disease CSF biomarkers: an update

Early diagnosis of neurodegenerative disorders such as Alzheimer’s or Parkinson’s disease (AD and PD) is needed to slow down or halt the disease at the earliest stage. Cerebrospinal fluid (CSF) biomarkers can be a good tool for early diagnosis. However, their use in clinical practice is challenging due to the high variability found between centers in the concentrations of both AD CSF biomarkers (Aβ4...

Volume: 2017

Publication date: 2017